Goto

Collaborating Authors

 Central Macedonia



Variational decomposition autoencoding improves disentanglement of latent representations

Ziogas, Ioannis, Shehhi, Aamna Al, Khandoker, Ahsan H., Hadjileontiadis, Leontios J.

arXiv.org Machine Learning

Understanding the structure of complex, nonstationary, high-dimensional time-evolving signals is a central challenge in scientific data analysis. In many domains, such as speech and biomedical signal processing, the ability to learn disentangled and interpretable representations is critical for uncovering latent generative mechanisms. Traditional approaches to unsupervised representation learning, including variational autoencoders (VAEs), often struggle to capture the temporal and spectral diversity inherent in such data. Here we introduce variational decomposition autoencoding (VDA), a framework that extends VAEs by incorporating a strong structural bias toward signal decomposition. VDA is instantiated through variational decomposition autoencoders (DecVAEs), i.e., encoder-only neural networks that combine a signal decomposition model, a contrastive self-supervised task, and variational prior approximation to learn multiple latent subspaces aligned with time-frequency characteristics. We demonstrate the effectiveness of DecVAEs on simulated data and three publicly available scientific datasets, spanning speech recognition, dysarthria severity evaluation, and emotional speech classification. Our results demonstrate that DecVAEs surpass state-of-the-art VAE-based methods in terms of disentanglement quality, generalization across tasks, and the interpretability of latent encodings. These findings suggest that decomposition-aware architectures can serve as robust tools for extracting structured representations from dynamic signals, with potential applications in clinical diagnostics, human-computer interaction, and adaptive neurotechnologies.


Balanced Online Class-Incremental Learning via Dual Classifiers

Wen, Shunjie, Heinis, Thomas, Choi, Dong-Wan

arXiv.org Artificial Intelligence

Online class-incremental learning (OCIL) focuses on gradually learning new classes (called plasticity) from a stream of data in a single-pass, while concurrently preserving knowledge of previously learned classes (called stability). The primary challenge in OCIL lies in maintaining a good balance between the knowledge of old and new classes within the continually updated model. Most existing methods rely on explicit knowledge interaction through experience replay, and often employ exclusive training separation to address bias problems. Nevertheless, it still remains a big challenge to achieve a well-balanced learner, as these methods often exhibit either reduced plasticity or limited stability due to difficulties in continually integrating knowledge in the OCIL setting. In this paper, we propose a novel replay-based method, called Balanced Inclusive Separation for Online iNcremental learning (BISON), which can achieve both high plasticity and stability, thus ensuring more balanced performance in OCIL. Our BISON method proposes an inclusive training separation strategy using dual classifiers so that knowledge from both old and new classes can effectively be integrated into the model, while introducing implicit approaches for transferring knowledge across the two classifiers. Extensive experimental evaluations over three widely-used OCIL benchmark datasets demonstrate the superiority of BISON, showing more balanced yet better performance compared to state-of-the-art replay-based OCIL methods.


Colliding with Adversaries at ECML-PKDD 2025 Model Robustness Competition 1st Prize Solution

Stefanopoulos, Dimitris, Voskou, Andreas

arXiv.org Artificial Intelligence

This report presents the winning solution for Task 2 of Colliding with Adversaries: A Challenge on Robust Learning in High Energy Physics Discovery at ECML-PKDD 2025. The goal of the challenge was to design and train a robust ANN-based model capable of achieving high accuracy in a binary classification task on both clean and adversarial data generated with the Random Distribution Shuffle Attack (RDSA). Our solution consists of two components: a data generation phase and a robust model training phase. In the first phase, we produced 15 million artificial training samples using a custom methodology derived from Random Distribution Shuffle Attack (RDSA). In the second phase, we introduced a robust architecture comprising (i)a Feature Embedding Block with shared weights among features of the same type and (ii)a Dense Fusion Tail responsible for the final prediction. Training this architecture on our adversarial dataset achieved a mixed accuracy score of 80\%, exceeding the second-place solution by two percentage points.


Evaluating Small Vision-Language Models on Distance-Dependent Traffic Perception

Theodoridis, Nikos, Brophy, Tim, Mohandas, Reenu, Sistu, Ganesh, Collins, Fiachra, Scanlan, Anthony, Eising, Ciaran

arXiv.org Artificial Intelligence

Vision-Language Models (VLMs) are becoming increasingly powerful, demonstrating strong performance on a variety of tasks that require both visual and textual understanding. Their strong generalisation abilities make them a promising component for automated driving systems, which must handle unexpected corner cases. However, to be trusted in such safety-critical applications, a model must first possess a reliable perception system. Moreover, since critical objects and agents in traffic scenes are often at a distance, we require systems that are not "shortsighted", i.e., systems with strong perception capabilities at both close (up to 20 meters) and long (30+ meters) range. With this in mind, we introduce Distance-Annotated Traffic Perception Question Answering (DTPQA), the first Visual Question Answering (VQA) benchmark focused solely on perception-based questions in traffic scenes, enriched with distance annotations. By excluding questions that require reasoning, we ensure that model performance reflects perception capabilities alone. Since automated driving hardware has limited processing power and cannot support large VLMs, our study centers on smaller VLMs. More specifically, we evaluate several state-of-the-art (SOTA) small VLMs on DTPQA and show that, despite the simplicity of the questions, these models significantly underperform compared to humans (~60% average accuracy for the best-performing small VLM versus ~85% human performance). However, it is important to note that the human sample size was relatively small, which imposes statistical limitations. We also identify specific perception tasks, such as distinguishing left from right, that remain particularly challenging for these models.


Heterogeneity in Multi-Robot Environmental Monitoring for Resolving Time-Conflicting Tasks

York, Connor, Madin, Zachary R, O'Dowd, Paul, Hunt, Edmund R

arXiv.org Artificial Intelligence

Multi-robot systems performing continuous tasks face a performance trade-off when interrupted by urgent, time-critical sub-tasks. We investigate this trade-off in a scenario where a team must balance area patrolling with locating an anomalous radio signal. To address this trade-off, we evaluate both behavioral heterogeneity through agent role specialization ("patrollers" and "searchers") and sensing heterogeneity (i.e., only the searchers can sense the radio signal). Through simulation, we identify the Pareto-optimal trade-offs under varying team compositions, with behaviorally heterogeneous teams demonstrating the most balanced trade-offs in the majority of cases. When sensing capability is restricted, heterogeneous teams with half of the sensing-capable agents perform comparably to homogeneous teams, providing cost-saving rationale for restricting sensor payload deployment. Our findings demonstrate that pre-deployment role and sensing specialization are powerful design considerations for multi-robot systems facing time-conflicting tasks, where varying the degree of behavioral heterogeneity can tune system performance toward either task.


Explainable Anomaly Detection for Industrial IoT Data Streams

Paupério, Ana Rita, Risca, Diogo, Lourenço, Afonso, Marreiros, Goreti, Martins, Ricardo

arXiv.org Artificial Intelligence

Industrial maintenance is being transformed by the Internet of Things and edge computing, generating continuous data streams that demand real-time, adaptive decision-making under limited computational resources. While data stream mining (DSM) addresses this challenge, most methods assume fully supervised settings, yet in practice, ground-truth labels are often delayed or unavailable. This paper presents a collaborative DSM framework that integrates unsupervised anomaly detection with interactive, human-in-the-loop learning to support maintenance decisions. We employ an online Isolation Forest and enhance interpretability using incremental Partial Dependence Plots and a feature importance score, derived from deviations of Individual Conditional Expectation curves from a fading average, enabling users to dynamically reassess feature relevance and adjust anomaly thresholds. We describe the real-time implementation and provide initial results for fault detection in a Jacquard loom unit. Ongoing work targets continuous monitoring to predict and explain imminent bearing failures.


DEFEND: Poisoned Model Detection and Malicious Client Exclusion Mechanism for Secure Federated Learning-based Road Condition Classification

Liu, Sheng, Papadimitratos, Panos

arXiv.org Artificial Intelligence

Federated Learning (FL) has drawn the attention of the Intelligent Transportation Systems (ITS) community. FL can train various models for ITS tasks, notably camera-based Road Condition Classification (RCC), in a privacy-preserving collaborative way. However, opening up to collaboration also opens FL-based RCC systems to adversaries, i.e., misbehaving participants that can launch Targeted Label-Flipping Attacks (TLFAs) and threaten transportation safety. Adversaries mounting TLFAs poison training data to misguide model predictions, from an actual source class (e.g., wet road) to a wrongly perceived target class (e.g., dry road). Existing countermeasures against poisoning attacks cannot maintain model performance under TLFAs close to the performance level in attack-free scenarios, because they lack specific model misbehavior detection for TLFAs and neglect client exclusion after the detection. To close this research gap, we propose DEFEND, which includes a poisoned model detection strategy that leverages neuron-wise magnitude analysis for attack goal identification and Gaussian Mixture Model (GMM)-based clustering. DEFEND discards poisoned model contributions in each round and adapts accordingly client ratings, eventually excluding malicious clients. Extensive evaluation involving various FL-RCC models and tasks shows that DEFEND can thwart TLFAs and outperform seven baseline countermeasures, with at least 15.78% improvement, with DEFEND remarkably achieving under attack the same performance as in attack-free scenarios.


Model-Based and Sample-Efficient AI-Assisted Math Discovery in Sphere Packing

Tutunov, Rasul, Maraval, Alexandre, Grosnit, Antoine, Li, Xihan, Wang, Jun, Bou-Ammar, Haitham

arXiv.org Artificial Intelligence

Sphere packing, Hilbert's eighteenth problem, asks for the densest arrangement of congruent spheres in n-dimensional Euclidean space. Although relevant to areas such as cryptography, crystallography, and medical imaging, the problem remains unresolved: beyond a few special dimensions, neither optimal packings nor tight upper bounds are known. Even a major breakthrough in dimension $n=8$, later recognised with a Fields Medal, underscores its difficulty. A leading technique for upper bounds, the three-point method, reduces the problem to solving large, high-precision semidefinite programs (SDPs). Because each candidate SDP may take days to evaluate, standard data-intensive AI approaches are infeasible. We address this challenge by formulating SDP construction as a sequential decision process, the SDP game, in which a policy assembles SDP formulations from a set of admissible components. Using a sample-efficient model-based framework that combines Bayesian optimisation with Monte Carlo Tree Search, we obtain new state-of-the-art upper bounds in dimensions $4-16$, showing that model-based search can advance computational progress in longstanding geometric problems. Together, these results demonstrate that sample-efficient, model-based search can make tangible progress on mathematically rigid, evaluation limited problems, pointing towards a complementary direction for AI-assisted discovery beyond large-scale LLM-driven exploration.


Incorporating Structure and Chord Constraints in Symbolic Transformer-based Melodic Harmonization

Kaliakatsos-Papakostas, Maximos, Soiledis, Konstantinos, Tsamis, Theodoros, Makris, Dimos, Katsouros, Vassilis, Cambouropoulos, Emilios

arXiv.org Artificial Intelligence

Transformer architectures offer significant advantages regarding the generation of symbolic music; their capabilities for incorporating user preferences toward what they generate is being studied under many aspects. This paper studies the inclusion of predefined chord constraints in melodic harmonization, i.e., where a desired chord at a specific location is provided along with the melody as inputs and the autoregressive transformer model needs to incorporate the chord in the harmonization that it generates. The peculiarities of involving such constraints is discussed and an algorithm is proposed for tackling this task. This algorithm is called B* and it combines aspects of beam search and A* along with backtracking to force pretrained transformers to satisfy the chord constraints, at the correct onset position within the correct bar. The algorithm is brute-force and has exponential complexity in the worst case; however, this paper is a first attempt to highlight the difficulties of the problem and proposes an algorithm that offers many possibilities for improvements since it accommodates the involvement of heuristics.